Sketch portrait generation benefits a wide range of applications such as digital entertainment and law enforcement. Although plenty of effort has been dedicated to this task, several issues remain unsolved in generating vivid and detail-preserving personal sketch portraits. For example, artifacts often appear when synthesizing hairpins and glasses, and textural details may be lost in regions of hair or mustache. Moreover, the generalization ability of current systems is limited, since they usually require elaborately collecting a dictionary of examples or carefully tuning features and components. In this paper, we present a novel representation learning framework that learns an end-to-end photo-to-sketch mapping through structure and texture decomposition. In the training stage, we first decompose the input face photo into different components according to their representational contents (i.e., structural and textural parts) using a pre-trained Convolutional Neural Network (CNN). We then employ a Branched Fully Convolutional Neural Network (BFCN) to learn structural and textural representations, respectively. In addition, we design a Sorted Matching Mean Square Error (SM-MSE) metric to measure texture patterns in the loss function. In the sketch rendering stage, our approach automatically generates structural and textural representations for the input photo and produces the final result via a probabilistic fusion scheme. Extensive experiments on several challenging benchmarks show that our approach outperforms example-based synthesis algorithms in terms of both perceptual and objective metrics. Moreover, the proposed method generalizes across datasets without additional training.
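The abstract does not spell out how the SM-MSE metric is computed. One plausible reading is that it sorts pixel values within local patches of the predicted and target sketches before taking an MSE, so the loss compares texture statistics rather than demanding exact pixel-level alignment. The PyTorch sketch below illustrates this interpretation only; the function name `sm_mse`, the patch size, and the patch-wise sorting scheme are assumptions for illustration, not the paper's exact formulation.

```python
import torch

def sm_mse(pred, target, patch_size=8):
    """Hypothetical sketch of a Sorted Matching MSE.

    Assumption: pixel values inside each non-overlapping patch are
    sorted before computing MSE, making the loss insensitive to the
    exact spatial arrangement of texture (e.g., hair strands).
    pred, target: tensors of shape (B, C, H, W).
    """
    # Split the feature maps into non-overlapping patches:
    # (B, C, H, W) -> (B, C*patch_size*patch_size, num_patches)
    unfold = torch.nn.Unfold(kernel_size=patch_size, stride=patch_size)
    p = unfold(pred)
    t = unfold(target)
    # Sort values within each patch; comparing sorted sequences
    # matches texture distributions instead of per-pixel positions.
    p_sorted, _ = torch.sort(p, dim=1)
    t_sorted, _ = torch.sort(t, dim=1)
    return torch.mean((p_sorted - t_sorted) ** 2)
```

Under this reading, SM-MSE would reduce to ordinary MSE when each patch covers a single pixel, and grows more tolerant of texture rearrangement as the patch size increases.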